
    Change-point model on nonhomogeneous Poisson processes with application in copy number profiling by next-generation DNA sequencing

    We propose a flexible change-point model for inhomogeneous Poisson processes, which arise naturally from next-generation DNA sequencing, and derive score and generalized likelihood statistics for shifts in intensity functions. We construct a modified Bayesian information criterion (mBIC) to guide model selection, and point-wise approximate Bayesian confidence intervals for assessing the confidence in the segmentation. The model is applied to DNA copy number profiling with sequencing data and evaluated on simulated spike-in and real data sets. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org), at http://dx.doi.org/10.1214/11-AOAS517

    Detecting simultaneous variant intervals in aligned sequences

    Given a set of aligned sequences of independent noisy observations, we are concerned with detecting intervals where the mean values of the observations change simultaneously in a subset of the sequences. The intervals of changed means are typically short relative to the length of the sequences; the subset where the change occurs, the "carriers," can be relatively small; and the sizes of the changes can vary from one sequence to another. This problem is motivated by the scientific problem of detecting inherited copy number variants in aligned DNA samples. We suggest a statistic based on the assumption that for any given interval of changed means there is a given fraction of samples that carry the change. We derive an analytic approximation for the false positive error probability of a scan, which is shown by simulations to be reasonably accurate. We show that the new method usually improves on methods that analyze a single sample at a time and on our earlier multi-sample method, which is most efficient when the carriers form a large fraction of the set of sequences. The proposed procedure is also shown to be robust with respect to the assumed fraction of carriers of the changes. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org), at http://dx.doi.org/10.1214/10-AOAS400

    Detecting mutations in mixed sample sequencing data using empirical Bayes

    We develop statistically based methods to detect single nucleotide DNA mutations in next-generation sequencing data. Sequencing generates counts of the number of times each base was observed at hundreds of thousands to billions of genome positions in each sample. Using these counts to detect mutations is challenging because mutations may have very low prevalence and sequencing error rates vary dramatically by genome position. The discreteness of sequencing data also creates a difficult multiple testing problem: current false discovery rate methods are designed for continuous data, and work poorly, if at all, on discrete data. We show that a simple randomization technique lets us use continuous false discovery rate methods on discrete data. Our approach is a useful way to estimate false discovery rates for any collection of discrete test statistics, and is hence not limited to sequencing data. We then use an empirical Bayes model to capture different sources of variation in sequencing error rates. The resulting method outperforms existing detection approaches on example data sets. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org), at http://dx.doi.org/10.1214/12-AOAS538
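
The abstract above does not spell out its randomization technique. As a rough illustration only, the sketch below shows one standard construction, the randomized p-value, which may or may not match the paper's exact method. It assumes a binomial null with a known per-position error rate `p0`; the function names `binom_pmf` and `randomized_p_value` are hypothetical and chosen for this example.

```python
import random
from math import comb


def binom_pmf(k, n, p):
    """Probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def randomized_p_value(k, n, p0, u=None):
    """Upper-tail randomized p-value for an observed count k.

    Assumes the null X ~ Binomial(n, p0) (an assumption for this sketch).
    Returns P(X > k) + u * P(X = k), where u is Uniform(0, 1).
    With u drawn at random, the result is exactly uniform under the null,
    which is what continuous false discovery rate methods expect.
    """
    if u is None:
        u = random.random()
    tail = sum(binom_pmf(j, n, p0) for j in range(k + 1, n + 1))
    return tail + u * binom_pmf(k, n, p0)
```

Passing `u` explicitly makes the function deterministic for checking; leaving it out draws a fresh uniform value per test. Setting `u = 1` recovers the ordinary (conservative) discrete p-value P(X >= k).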

    Hoxb2 and Hoxb4 Act Together to Specify Ventral Body Wall Formation

    Three different alleles of the Hoxb4 locus were generated by gene targeting in mice. Two alleles contain insertions of a selectable marker in the first exon in either orientation, and, in the third, the selectable marker was removed, resulting in premature termination of the protein. Presence and orientation of the selectable marker correlated with the severity of the phenotype, indicating that the selectable marker induces cis effects on neighboring genes that influence the phenotype. Homozygous mutants of all alleles had cervical skeletal defects similar to those previously reported for Hoxb4 mutant mice. In the most severe allele, Hoxb4PolII, homozygous mutants died either in utero at approximately E15.5 or immediately after birth, with a severe defect in ventral body wall formation. Analysis of embryos showed thinning of the primary ventral body wall in mutants relative to control animals at E11.5, before secondary body wall formation. Prior to this defect, both Alx3 and Alx4 were specifically downregulated in the most ventral part of the primary body wall in Hoxb4PolII mutants. Hoxb4loxp mutants in which the neo gene has been removed did not have body wall or sternum defects. In contrast, both the Hoxb4PolII and the previously described Hoxb2PolII alleles that have body wall defects have been shown to disrupt the expression of both Hoxb2 and Hoxb4 in cell types that contribute to body wall formation. Our results are consistent with a model in which defects in ventral body wall formation require the simultaneous loss of at least Hoxb2 and Hoxb4, and may involve Alx3 and Alx4.